A Measure Of Aggregate Syntactic Distance

Authors

  • John Nerbonne
  • Wybo Wiersma
Abstract

We compare vectors containing counts of trigrams of part-of-speech (POS) tags in order to obtain an aggregate measure of syntactic difference. Since lexical syntactic categories reflect more abstract syntax as well, we argue that this procedure reflects more than just the basic syntactic categories. We tag the material automatically and analyze the frequency vectors for POS trigrams using a permutation test. A test analysis of a 305,000-word corpus containing the English of Finnish emigrants to Australia is promising in that the proposed procedure works well in distinguishing two different groups (adult vs. child emigrants) and also in highlighting syntactic deviations between the two groups.
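The procedure summarized in the abstract can be illustrated with a minimal sketch. It assumes the input is already POS-tagged (each sentence is a list of tag strings), and it makes two choices that the abstract does not specify: the distance is the sum of absolute differences between relative trigram frequencies, and the permutation test reshuffles whole sentences between the two groups. The function names are hypothetical, and this is not necessarily the authors' exact measure or normalization.

```python
import random
from collections import Counter


def trigram_counts(sentences):
    """Count POS-tag trigrams over a list of sentences (each a list of tags)."""
    counts = Counter()
    for tags in sentences:
        for i in range(len(tags) - 2):
            counts[tuple(tags[i:i + 3])] += 1
    return counts


def distance(sents_a, sents_b):
    """Assumed measure: sum of absolute differences of relative frequencies."""
    ca, cb = trigram_counts(sents_a), trigram_counts(sents_b)
    na = sum(ca.values()) or 1  # avoid division by zero for empty groups
    nb = sum(cb.values()) or 1
    return sum(abs(ca[t] / na - cb[t] / nb) for t in set(ca) | set(cb))


def permutation_test(sents_a, sents_b, n_perm=1000, seed=0):
    """Return the observed distance and an estimated p-value.

    Sentences are pooled and randomly reassigned to two groups of the
    original sizes; the p-value is the share of reshuffles whose distance
    is at least as large as the observed one (with add-one smoothing).
    """
    rng = random.Random(seed)
    observed = distance(sents_a, sents_b)
    pooled = list(sents_a) + list(sents_b)
    k = len(sents_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if distance(pooled[:k], pooled[k:]) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Toy usage with invented tag sequences (not data from the paper).
adults = [["DT", "NN", "VBZ", "DT", "NN"], ["PRP", "VBZ", "DT", "NN"]]
children = [["DT", "JJ", "NN", "VBZ", "RB"], ["PRP", "VBD", "DT", "JJ", "NN"]]
observed, p_value = permutation_test(adults, children, n_perm=200)
```

A small p-value would suggest that the two groups differ in their trigram distributions more than random regrouping of the same sentences would; with toy data of this size the result is not meaningful and only shows the mechanics.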


Similar resources

Measuring Syntactic Variation in Dutch Dialects

This research applies dialectometric methods to purely syntactic dialect data. It will be shown that there is geographic cohesion in syntactic variation when viewed in the aggregate. The amount of syntactic variation which can be accounted for by geography will be determined. Dialectometric techniques will be used to develop an additive measure of syntactic differences. Multidimensional scaling...


A new vector valued similarity measure for intuitionistic fuzzy sets based on OWA operators

Many studies have focused on measures of distance, similarity, and correlation between intuitionistic fuzzy sets (IFSs). However, most of them are single-valued measures and lack potential for efficiency validation. In this paper, a new vector-valued similarity measure for IFSs is proposed based on OWA operators. The vector is defined as a two-tuple consisting of ...


Generalized Aggregate Uncertainty Measure 2 for Uncertainty Evaluation of a Dezert-Smarandache Theory based Localization Problem

In this paper, Generalized Aggregated Uncertainty measure 2 (GAU2), as a new uncertainty measure, is considered to evaluate uncertainty in a localization problem in which cameras' images are used. The theory that is applied to a hierarchical structure for decision making to combine cameras' images is Dezert-Smarandache theory. To evaluate decisions, an analysis of uncertainty is executed at every...


The Relationship between Syntactic and Lexical Complexity in Speech Monologues of EFL Learners

This study aims to explore the relationship between syntactic and lexical complexity and also the relationship between different aspects of lexical complexity. To this end, speech monologues of 35 Iranian high-intermediate learners of English on three different tasks (i.e. argumentation, description, and narration) were analyzed for correlations between one measure of sy...


Detecting Syntactic Substratum Effects Automatically in Interlanguage Corpora

This paper applies techniques to obtain an aggregate measure of syntactic distance between two varieties of English spoken by first- and second-generation Finnish Australians and examines the degree of what we call syntactic ‘contamination’ in the two. Our general goal is to detect the linguistic sources of the variation between the two groups and interpret the findings from (at least) two perspe...




Publication date: 2006